Evaluating the Link between Word Frequencies and Pronunciation Variants: a Cross-lingual Study on Read and Spontaneous Speech

Authors

  • L. Lamel
  • M. Adda-Decker
Abstract

The aim of this contribution is twofold: evaluating the use of pronunciation variants in read and spontaneous speech, and studying the link between word frequencies and pronunciation variants. The dependence of pronunciation variants on a given system configuration is also addressed in the first part. For the second aspect of this work, different variant types are defined. A cross-lingual study is carried out for both read and spontaneous speech in French and American English using the following corpora: BREF [2], MASK [3], WSJ [4], ARPA-HUB4 [5].

1 Evaluating the use of pronunciation variants

Adding pronunciation variants to a recognition system's lexicon is a means of increasing the acoustic modeling options for the words concerned. The additional variants are expected to improve the recognizer's decoding accuracy provided they concern potential error regions. However, if the type of variants is inappropriate (not relevant) with respect to the recognizer's weaknesses, or if the number of variants is too high, the recognizer's overall performance may decrease. How many times were the new pronunciation variants, which were added to solve a given decoding problem, globally ineffective? While solving the problem for which they were designed, the variants may introduce new errors elsewhere, canceling the local benefit. As variants often increase the homophone rate, they are potential error sources. Furthermore, a large increase in the number of variants can degrade decoding performance in terms of computational requirements. Variants are thus introduced carefully in our speech recognition systems.

In this contribution we address the use of pronunciation variants during speech/transcription alignment in different system configurations (using different transcription lexica and different acoustic model sets). The aim is to evaluate the number and type of the observed pronunciation variants as a function of a particular system configuration.
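The remark that variants tend to raise the homophone rate can be illustrated with a small sketch. The definition used here (the share of words that have at least one pronunciation in common with another word) is an assumption, as are the function name `homophone_rate`, the toy lexica and the phone symbols; none of them come from the paper.

```python
from collections import defaultdict


def homophone_rate(lexicon):
    """Share of words that have at least one pronunciation in common
    with another word (an assumed definition, not the paper's)."""
    words_by_pron = defaultdict(set)
    for word, prons in lexicon.items():
        for pron in prons:
            # A pronunciation is a space-separated phone string.
            words_by_pron[tuple(pron.split())].add(word)

    homophones = set()
    for words in words_by_pron.values():
        if len(words) > 1:
            homophones |= words
    return len(homophones) / len(lexicon) if lexicon else 0.0


# Hypothetical toy lexica: the same word list without and with variants.
base = {"a": ["ax"], "the": ["dh ax"], "of": ["ah v"]}
with_variants = {
    "a": ["ax", "ey"],
    "the": ["dh ax", "dh iy"],
    "of": ["ah v", "ax"],  # reduced variant collides with "a"
}

print(homophone_rate(base))
print(homophone_rate(with_variants))
```

In this toy case the reduced variant of "of" collides with "a", so two of the three words become homophones once variants are added.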
For this study three (four) pronunciation lexica are used (corresponding to the same orthographic word list):

  • LEX1: standard lexicon for LVSR (5-10% of variants)
  • LEX1': no variants (full form pronunciations of LEX1)
  • LEX2: lexicon with a large number of variants (40-60%)
  • LEX3: lexicon with a very large number of variants (80-100%)

Variants in LEX2 and LEX3 are derived semi-automatically from phone recognition experiments. LEX1' (without variants) or LEX1 (the standard recognition lexicon) can be used as the reference lexicon to measure the occurrence rate of additional variants from LEX2 and LEX3. Acoustic models are trained for each lexicon. For a given lexicon (LEX2, LEX3), alignment is carried out using the different acoustic model sets …
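The two quantities mentioned above, the proportion of variants in a lexicon and the occurrence rate of additional variants relative to a reference lexicon, might be computed along the following lines. This is only a sketch under stated assumptions: the variant rate is taken to be the number of extra pronunciations per word (one plausible reading of the percentages), the alignment output is assumed to be a list of (word, chosen pronunciation) pairs, and all names and toy entries are hypothetical.

```python
def variant_rate(lexicon):
    """Extra pronunciations per word: (pronunciations - words) / words.
    Assumed reading of the 'x% of variants' figures."""
    n_words = len(lexicon)
    n_extra = sum(len(set(prons)) - 1 for prons in lexicon.values())
    return n_extra / n_words if n_words else 0.0


def additional_variant_occurrence_rate(aligned, reference):
    """Share of aligned word tokens whose chosen pronunciation is not
    listed for that word in the reference lexicon."""
    if not aligned:
        return 0.0
    n_new = sum(
        1 for word, pron in aligned if pron not in reference.get(word, ())
    )
    return n_new / len(aligned)


# Hypothetical toy lexica standing in for LEX1' (reference) and LEX2.
lex1_prime = {"and": ["ae n d"], "the": ["dh ax"]}
lex2 = {"and": ["ae n d", "ax n", "n"], "the": ["dh ax", "dh iy"]}

# Hypothetical alignment output obtained with LEX2:
# one (word, chosen pronunciation) pair per aligned token.
aligned = [
    ("the", "dh ax"),
    ("and", "ax n"),
    ("the", "dh iy"),
    ("and", "ae n d"),
]

print(variant_rate(lex1_prime))
print(variant_rate(lex2))
print(additional_variant_occurrence_rate(aligned, lex1_prime))
```

Repeating the last call for each combination of lexicon and acoustic model set would give the occurrence rate as a function of the system configuration.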


Similar resources

Pronunciation variants across system configuration, language and speaking style

This contribution aims at evaluating the use of pronunciation variants for different recognition system configurations, languages and speaking styles. This study is limited to the use of variants during speech alignment, given an orthographic transcription of the utterance and a phonemically represented lexicon, and is thus focused on the modeling capabilities of the acoustic word models. To meas...


Pronunciation Variants Across Systems, Languages and Speaking Style

This contribution aims at evaluating the use of pronunciation variants across different system configurations, languages and speaking styles. This study is limited to the use of variants during speech alignment, given an orthographic transcription and a phonemically represented lexicon, thus focusing on the modeling abilities of the acoustic word models. Parallel and sequential variants are tes...


Pronunciation variant analysis using speaking style parallel corpus

To improve the recognition accuracy for spontaneous conversational speech, we collected a corpus to study how spontaneous conversational speech differs from read style speech. The corpus consists of two parts: 1) spontaneous conversational speech and 2) read speech with the same word transcriptions as the conversational speech. In word and phone recognition experiments, it was confirmed that, f...


Adapting the acoustic model of a speech recognizer for varied proficiency non-native spontaneous speech using read speech with language-specific pronunciation difficulty

This paper presents a novel approach to acoustic model adaptation of a recognizer for non-native spontaneous speech in the context of recognizing candidates’ responses in a test of spoken English. Instead of collecting and then transcribing spontaneous speech data, a read speech corpus is created where non-native speakers of English read English sentences of different degrees of pronunciation d...


Comparison between Expert Listeners and Continuous Speech Recognizers in Selecting Pronunciation Variants

In this paper, the performance of an automatic transcription tool is evaluated. The transcription tool is a continuous speech recognizer (CSR) which can be used to select pronunciation variants (i.e. detect insertions and deletions of ph...

... corpus is by modeling pronunciation variation [2]. Another way of obtaining models which are less contaminated is to train PMs on read speech. It is well known ...


Publication date: 2007